The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

Claims 1-26 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
MERRILL et al. (6,369,821) in view of SUTTON et al. (6,539,354). 

As per claim 1, Merrill teaches the claimed "communication system for 
performing a conversation with an actual or fictional human, animal, doll, character or 
the like virtualized by using a computer, comprising: a client and a server" (Merrill, 
column 5, lines 32-46; column 17, lines 9-20), wherein "the client includes: an input 
portion for inputting a first message addressed from a user to the human or the like; a 
transmitting portion for transmitting the first message (Merrill, column 3, line 30 to 
column 4, line 10; column 33, lines 32-40 - computer 20 with access to a network); a 
receiving portion for receiving a second message and facial animation of the human or 
the like, the second message being addressed from the human or the like to the user as 
a response to the first message (Merrill, column 15, lines 40-65; column 35, lines 25-40 
- request and receive the animation data); an output portion for outputting the second 
message to the user; and a display portion for displaying the facial animation" (Merrill, 
column 36, line 66 to column 37, line 10), and "the server includes: a storing portion for 
storing facial image data of the human or the like, a receiving portion for receiving the 
first message (Merrill, column 33, lines 53-66 - a remote computer provides the 
animation data), a first generating portion for generating the second message in 
response to the reception of the first message, a second generating portion for
generating motion control data, a third generating portion for generating the facial 
animation (Merrill, column 11, lines 44-61; column 23, lines 1-30; column 35, lines 28-60 
- generation of the animation data of a character); and a transmitting portion for 
transmitting the second message and the facial animation" (Merrill, column 33, line 41 to 
column 34, line 18). It is noted that Merrill does not teach the facial animation data is
"based on the motion control data and the facial image data". Sutton teaches that in 
speech synchronization, the facial animation data is "based on the motion control data 
and the facial image data" (Sutton, column 8, lines 6-49). It would have been obvious to 
a person of ordinary skill in the art at the time the invention was made, in view of the 
teaching of Sutton (column 6, lines 42-62), to configure Merrill's system as claimed 
because the generation of facial animation of a speech, in which the animation is based 
on motion control and facial image data, provides a realistic and natural visual
appearance of human speech in which the voice and the human facial vision
synchronously represent the speech.

Claim 2 adds into claim 1 "the server is provided with a storing portion for storing 
person information as information concerning the human or the like, and the first 
generating portion generates the second message with reference to the person 
information concerning the human or the like" (Merrill, column 15, lines 40-46). 

Claim 3 adds into claim 2 "the server is provided with a storing portion for storing 
sentence information as information for generating a conversation sentence, and the
first generating portion extracts such sentence information that are likely to be used for 
a response from the human or the like to the first message and generates the second 
message" (Merrill, column 23, lines 1-9). 

Claim 4 adds into claim 1 wherein the facial image data are data represented by 
a three-dimensional model so structured as to move, and the third generating portion 
causes a structured part of the three-dimensional model to move based on the motion 
control data, which Merrill does not explicitly teach. However, Sutton teaches that in 
speech synchronization, the facial image data are data represented by a three- 
dimensional model and the facial animation data is "based on the motion control data" 
(Sutton, column 3, lines 12-18; column 8, lines 6-49). It would have been obvious to a 
person of ordinary skill in the art at the time the invention was made, in view of the 
teaching of Sutton (column 6, lines 42-62), to configure Merrill's system as claimed 
because the generation of facial animation of a speech, in which the animation is based 
on motion control, provides a realistic and natural visual appearance of human speech
in which the voice and the human facial vision synchronously represent the speech.

As per claim 5, Merrill teaches the claimed "communication system for 
performing a conversation with an actual or fictional human, animal, doll, character or 
the like virtualized by using a computer, comprising: a client and a server" (Merrill,
column 5, lines 32-46; column 17, lines 9-20), wherein "the client includes: an input
portion for inputting a first message addressed from a user to the human or the like; a 
transmitting portion for transmitting the first message (Merrill, column 3, line 30 to 
column 4, line 10; column 33, lines 32-40 - computer 20 with access to a network); a 
receiving portion for receiving a second message and facial animation of the human or 
the like, the second message being addressed from the human or the like to the user as 
a response to the first message (Merrill, column 15, lines 40-65; column 35, lines 25-40 
- request and receive the animation data); an output portion for outputting a second 
message to the user, the second message being addressed from the human or the like 
to the user as a response to the first message; and a display portion for displaying the 
facial animation" (Merrill, column 36, line 66 to column 37, line 10), and "the server 
includes: a storing portion for storing facial image data of the human or the like, a 
receiving portion for receiving the first message (Merrill, column 33, lines 53-66 - a 
remote computer provides the animation data), a first generating portion for generating 
the second message in response to the reception of the first message, a second 
generating portion for generating motion control data (Merrill, column 11, lines 44-61;
column 23, lines 1-30; column 35, lines 28-60 - generation of the animation data of a 
character); and a transmitting portion for transmitting the second message and the 
motion control data" (Merrill, column 33, line 41 to column 34, line 18). It is noted that 
Merrill does not teach "a generating portion for generating facial animation of the human 
or the like based on the motion control data and the facial image data". Sutton teaches 
that in speech synchronization, the facial animation data is "based on the motion control 
data and the facial image data" (Sutton, column 8, lines 6-49). It would have been
obvious to a person of ordinary skill in the art at the time the invention was made, in 
view of the teaching of Sutton (column 6, lines 42-62), to configure Merrill's system as 
claimed because the generation of facial animation of a speech, in which the animation 
is based on motion control and facial image data, provides a realistic and natural visual
appearance of human speech in which the voice and the human facial vision
synchronously represent the speech.

Claim 6 adds into claim 5 "wherein the server is provided with a storing portion 
for storing person information as information concerning the human or the like, and the 
first generating portion generates the second message with reference to the person 
information concerning the human or the like" (Merrill, column 15, lines 40-46). 

Claim 7 adds into claim 6 "wherein the server is provided with a storing portion 
for storing sentence information as information for generating a conversation sentence, 
and the first generating portion extracts such sentence information that are likely to be 
used for a response from the human or the like to the first message and generates the 
second message" (Merrill, column 23, lines 1-9). 

As per claim 8, Merrill teaches the claimed "communication system for 
performing a conversation with an actual or fictional human, animal, doll, character or 
the like virtualized by using a computer, comprising: a client and a server" (Merrill,
column 5, lines 32-46; column 17, lines 9-20), wherein "the client includes: a storing 
portion for storing facial image data of the human or the like; an input portion for 
inputting a first message addressed from a user to the human or the like; a transmitting 
portion for transmitting the first message (Merrill, column 3, line 30 to column 4, line 10; 
column 33, lines 32-40 - computer 20 with access to a network); a receiving portion for 
receiving the second message, the facial image data and motion control data (Merrill, 
column 15, lines 40-65; column 35, lines 25-40 - request and receive the animation 
data); an output portion for outputting a second message to the user, the second 
message being addressed from the human or the like to the user as a response to the 
first message; and a display portion for displaying the facial animation" (Merrill, column 
36, line 66 to column 37, line 10), and "the server includes: a receiving portion for 
receiving the first message (Merrill, column 33, lines 53-66 - a remote computer 
provides the animation data), a first generating portion for generating the second 
message in response to the reception of the first message; a second generating portion 
for generating the motion control data (Merrill, column 11, lines 44-61; column 23, lines 
1-30; column 35, lines 28-60 - generation of the animation data of a character); and a
transmitting portion for transmitting the second message and the motion control data" 
(Merrill, column 33, line 41 to column 34, line 18). It is noted that Merrill does not teach 
a generating portion for generating facial animation of the human or the like based on 
the motion control data and the facial image data. Sutton teaches that in speech 
synchronization, the facial animation data is "based on the motion control data and the 
facial image data" (Sutton, column 8, lines 6-49). It would have been obvious to a
person of ordinary skill in the art at the time the invention was made, in view of the 
teaching of Sutton (column 6, lines 42-62), to configure Merrill's system as claimed 
because the generation of facial animation of a speech, in which the animation is based 
on motion control and facial image data, provides a realistic and natural visual
appearance of human speech in which the voice and the human facial vision
synchronously represent the speech.

Claim 9 adds into claim 8 "wherein the server is provided with a storing portion 
for storing person information as information concerning the human or the like, and the 
first generating portion generates the second message with reference to the person 
information concerning the human or the like" (Merrill, column 15, lines 40-46). 

Claim 10 adds into claim 9 "wherein the server is provided with a storing portion 
for storing sentence information as information for generating a conversation sentence, 
and the first generating portion extracts such sentence information that are likely to be 
used for a response from the human or the like to the first message and generates the 
second message" (Merrill, column 23, lines 1-9). 

As per claim 11, Merrill teaches the claimed "server used for a communication
system for performing a conversation with an actual or fictional human, animal, doll, 
character or the like virtualized by using a computer" (Merrill, column 5, lines 32-46;
column 17, lines 9-20), the server comprising: a storing portion for storing facial image 
data of the human or the like; a receiving portion for receiving a first message 
addressed from a user to the human or the like (Merrill, column 33, lines 53-66 - a 
remote computer provides the animation data), a first generating portion for generating 
a second message, the second message being addressed from the human or the like to
the user as a response to the first message; a second generating portion for generating 
motion control data; a third generating portion for generating facial animation (Merrill, 
column 11, lines 44-61; column 23, lines 1-30; column 35, lines 28-60 - generation of
the animation data of a character); and a transmitting portion for transmitting the second 
message and the facial animation" (Merrill, column 33, line 41 to column 34, line 18). It 
is noted that Merrill does not teach the facial animation data is "based on the motion 
control data and the facial image data". Sutton teaches that in speech synchronization, 
the facial animation data is "based on the motion control data and the facial image data" 
(Sutton, column 8, lines 6-49). It would have been obvious to a person of ordinary skill 
in the art at the time the invention was made, in view of the teaching of Sutton (column 
6, lines 42-62), to configure Merrill's system as claimed because the generation of facial 
animation of a speech, in which the animation is based on motion control and facial 
image data, provides a realistic and natural visual appearance of human speech in
which the voice and the human facial vision synchronously represent the speech.



Claim 12 adds into claim 11 wherein the facial image data are data represented
by a three-dimensional model so structured as to move, and the third generating portion
causes a structured part of the three-dimensional model to move based on the motion 
control data which Merrill does not explicitly teach. However, Sutton teaches that in 
speech synchronization, the facial image data are data represented by a three- 
dimensional model and the facial animation data is "based on the motion control data" 
(Sutton, column 3, lines 12-18; column 8, lines 6-49). It would have been obvious to a 
person of ordinary skill in the art at the time the invention was made, in view of the 
teaching of Sutton (column 6, lines 42-62), to configure Merrill's system as claimed 
because the generation of facial animation of a speech, in which the animation is based 
on motion control, provides a realistic and natural visual appearance of human speech
in which the voice and the human facial vision synchronously represent the speech.

As per claim 13, Merrill teaches the claimed "server used for a communication 
system for performing a conversation with an actual or fictional human, animal, doll, 
character or the like virtualized by using a computer" (Merrill, column 5, lines 32-46; 
column 17, lines 9-20), the server comprising: a storing portion for storing facial image 
data of the human or the like; a receiving portion for receiving a first message 
addressed from a user to the human or the like (Merrill, column 33, lines 53-66 - a 
remote computer provides the animation data), a first generating portion for generating 
a second message, the second message being addressed from the human or the like to 
the user as a response to the first message; a second generating portion for generating 
motion control data (Merrill, column 11, lines 44-61; column 23, lines 1-30; column 35,
lines 28-60 - generation of the animation data of a character); a transmitting portion for
transmitting the second message and the motion control data" (Merrill, column 33, line 
41 to column 34, line 18). It is noted that Merrill does not explicitly teach the facial image 
data is "moved in accordance on the motion control data". Sutton teaches that in 
speech synchronization, the facial image data is "based on the motion control data and 
the facial image data" (Sutton, column 8, lines 6-49). It would have been obvious to a 
person of ordinary skill in the art at the time the invention was made, in view of the 
teaching of Sutton (column 6, lines 42-62), to configure Merrill's system as claimed 
because the generation of facial animation of a speech, in which the animation is based 
on motion control and facial image data, provides a realistic and natural visual
appearance of human speech in which the voice and the human facial vision
synchronously represent the speech.



As per claim 14, Merrill teaches the claimed "server used for a communication 
system for performing a conversation with an actual or fictional human or like virtualized 
by using a computer" (Merrill, column 5, lines 32-46; column 17, lines 9-20), the server 
comprising: a receiving portion for receiving a first message addressed from a user to 
the human or the like (Merrill, column 33, lines 53-66 - a remote computer provides the 
animation data), a first generating portion for generating a second message, the second 
message being addressed from the human or the like to the user as a response to the 
first message; a second generating portion for generating motion control data (Merrill,
column 11, lines 44-61; column 23, lines 1-30; column 35, lines 28-60 - generation of
the animation data of a character); and a transmitting portion for transmitting the second 
message and the motion control data" (Merrill, column 33, line 41 to column 34, line 18). 
It is noted that Merrill does not explicitly teach the facial image data is "moved in 
accordance on the motion control data". Sutton teaches that in speech synchronization, 
the facial image data is "based on the motion control data and the facial image data" 
(Sutton, column 8, lines 6-49). It would have been obvious to a person of ordinary skill 
in the art at the time the invention was made, in view of the teaching of Sutton (column 
6, lines 42-62), to configure Merrill's system as claimed because the generation of facial 
animation of a speech, in which the animation is based on motion control and facial 
image data, provides a realistic and natural visual appearance of human speech in
which the voice and the human facial vision synchronously represent the speech.



As per claim 15, Merrill teaches the claimed "client used for a communication 
system for performing a conversation with an actual or fictional human, animal, doll, 
character or the like virtualized by using a computer" (Merrill, column 5, lines 32-46; 
column 17, lines 9-20), "the client comprising: an input portion for inputting a
first message addressed from a user to the human or the like; a transmitting portion for 
transmitting the first message (Merrill, column 3, line 30 to column 4, line 10; column 33, 
lines 32-40 - computer 20 with access to a network); a receiving portion for receiving 
the second message, facial image data indicating a face of the human by using image
data and motion control data for causing the facial image data to move in accordance 
with the second message (Merrill, column 15, lines 40-65; column 35, lines 25-40 - 
request and receive the animation data); an output portion for outputting a second 
message, the second message being addressed from the human or the like to the user 
as a response to the first message; a generating portion for generating facial animation 
of the human or the like; and a display portion for displaying the facial animation" 
(Merrill, column 36, line 66 to column 37, line 10). It is noted that Merrill does not 
teach the facial animation data is "based on the motion control data and the facial image 
data". Sutton teaches that in speech synchronization, the facial animation data is 
"based on the motion control data and the facial image data" (Sutton, column 8, lines 6- 
49). It would have been obvious to a person of ordinary skill in the art at the time the 
invention was made, in view of the teaching of Sutton (column 6, lines 42-62), to 
configure Merrill's system as claimed because the generation of facial animation of a 
speech, in which the animation is based on motion control and facial image data, 
provides a realistic and natural visual appearance of human speech in which the voice
and the human facial vision synchronously represent the speech.

Claim 16 adds into claim 15 wherein the facial image data are data represented 
by a three-dimensional model so structured as to move, and the third generating portion 
causes a structured part of the three-dimensional model to move based on the motion 
control data which Merrill does not explicitly teach. However, Sutton teaches that in 
speech synchronization, the facial image data are data represented by a
three-dimensional model and the facial animation data is "based on the motion control data"
(Sutton, column 3, lines 12-18; column 8, lines 6-49). It would have been obvious to a 
person of ordinary skill in the art at the time the invention was made, in view of the 
teaching of Sutton (column 6, lines 42-62), to configure Merrill's system as claimed 
because the generation of facial animation of a speech, in which the animation is based 
on motion control, provides a realistic and natural visual appearance of human speech
in which the voice and the human facial vision synchronously represent the speech.

As per claim 17, Merrill teaches the claimed "communication system for 
performing a conversation with watching a partner's animation comprising: a host 
computer and a plurality of terminal devices" (Merrill, column 5, lines 32-46; column 17, 
lines 9-20), wherein each of the terminal devices includes: a transmission and reception 
portion for transmitting and receiving a message (Merrill, column 3, line 30 to column 4, 
line 10; column 33, lines 32-40 - computer 20 with access to a network); a first 
receiving portion for receiving image data, a second receiving portion for receiving 
motion control data (Merrill, column 15, lines 40-65; column 35, lines 25-40 - request 
and receive the animation data); a display portion for displaying animation generated by 
moving the image data" (Merrill, column 36, line 66 to column 37, line 10), and a host 
computer includes: a receiving portion for receiving a message (Merrill, column 33, lines 
53-66 - a remote computer provides the animation data), a generating portion for 
generating the motion control data based on the translated message (Merrill, column 
11, line 62 to column 12, line 2; column 23, lines 1-30; column 35, lines 28-60 -
generation of the animation data of a character's speech); a translation portion for 
translating the received message into another natural language (Merrill, column 23, 
lines 9-28); a first transmitting portion for transmitting the translated message and a 
second transmitting portion for transmitting the image data and the motion control data 
of one of the terminal devices in communication to another one of the terminal devices 
in the communication" (Merrill, column 33, line 41 to column 34, line 18). It is noted 
that Merrill does not teach the facial animation data is "based on the motion control 
data". Sutton teaches that in speech synchronization, the facial animation data is 
"based on the motion control data" (Sutton, column 8, lines 6-49). It is also noted that 
Merrill does not teach the message is a voice. Sutton also teaches in detail the
speech synthesis, which translates the message in the form of voice into another natural
language (Sutton, column 17, lines 33-43). It would have been obvious to a person of 
ordinary skill in the art at the time the invention was made, in view of the teaching of 
Sutton (column 6, lines 42-62), to configure Merrill's system as claimed because the 
generation of facial animation of a speech, in which the animation is based on motion 
control, provides a realistic and natural visual appearance of human speech in which
the voice and the human facial vision synchronously represent the speech.

Claim 18 adds into claim 17 wherein the facial image data are data represented 
by a three-dimensional model so structured as to move, and the third generating portion 
causes a structured part of the three-dimensional model to move based on the motion 
control data which Merrill does not explicitly teach. However, Sutton teaches that in
speech synchronization, the facial image data are data represented by a three- 
dimensional model and the facial animation data is "based on the motion control data" 
(Sutton, column 3, lines 12-18; column 8, lines 6-49). It would have been obvious to a 
person of ordinary skill in the art at the time the invention was made, in view of the 
teaching of Sutton (column 6, lines 42-62), to configure Merrill's system as claimed 
because the generation of facial animation of a speech, in which the animation is based 
on motion control, provides a realistic and natural visual appearance of human speech
in which the voice and the human facial vision synchronously represent the speech.



As per claim 19, Merrill teaches the claimed "host computer used for a 
communication system for performing a conversation with watching partner's animation" 
(Merrill, column 5, lines 32-46; column 17, lines 9-20), the host computer comprising: a 
transmission and reception portion for transmitting and receiving a message (Merrill, 
column 33, lines 53-66 - a remote computer provides the animation data), a generating 
portion for generating motion control data used for making facial image data move 
(Merrill, column 11, line 62 to column 12, line 2; column 23, lines 1-30; column 35, lines 
28-60 - generation of the animation data of a character's speech including the
phonemes of the synthesized audio); a translation portion for translating the received 
voice into another natural language (Merrill, column 23, lines 9-28); a first transmitting 
portion for transmitting the translated message; and a second transmitting portion for 
transmitting the image data and the motion control data of one of the terminal devices in
communication to another one of the terminal devices in the communication" (Merrill, 
column 33, line 41 to column 34, line 18). It is noted that Merrill does not teach the
facial animation data is "based on the motion control data and the translated message". 
Sutton teaches that in speech synchronization, the facial animation data is "based on 
the motion control data and the translated message" (Sutton, column 8, lines 6-49). It is 
also noted that Merrill does not teach the message is a voice. Sutton also teaches in
detail the speech synthesis, which translates the message in the form of voice into another
natural language (Sutton, column 17, lines 33-43). It would have been obvious to a 
person of ordinary skill in the art at the time the invention was made, in view of the 
teaching of Sutton (column 6, lines 42-62), to configure Merrill's system as claimed 
because the generation of facial animation of a speech, in which the animation is based 
on motion control, provides a realistic and natural visual appearance of human speech
in which the voice and the human facial vision synchronously represent the speech.

As per claim 20, Merrill teaches the claimed "communication system for 
performing a conversation with watching partner's animation, comprising: a host 
computer and a plurality of terminal devices" (Merrill, column 5, lines 32-46; column 17, 
lines 9-20), wherein each of the terminal devices includes: a first transmission and 
reception portion for transmitting and receiving a message (Merrill, column 3, line 30 to 
column 4, line 10; column 33, lines 32-40 - computer 20 with access to a network); a 
storing portion for storing image data; a second transmission and reception portion for 
transmitting and receiving the image data (Merrill, column 15, lines 40-65; column 35,
lines 25-40 - request and receive the animation data); a generating portion for 
generating motion control data; and a display portion for displaying animation" (Merrill, 
column 36, line 66 to column 37, line 10), and the host computer includes: a receiving 
portion for receiving a message (Merrill, column 33, lines 53-66 - a remote computer 
provides the animation data), a translation portion for translating the received message 
into another natural language (Merrill, column 11, lines 44-61; column 23, lines 1-30; 
column 35, lines 28-60 - generation of the animation data of a character including the 
phonemes of the synthesized audio); and a transmitting portion for transmitting the 
translated message" (Merrill, column 33, line 41 to column 34, line 18). It is noted that 
Merrill does not teach the facial animation data is "based on the motion control data and 
the translated message". Sutton teaches that in speech synchronization, the facial 
animation data is "based on the motion control data and the translated message" 
(Sutton, column 8, lines 6-49). It is also noted that Merrill does not teach the message 
is a voice. Sutton also teaches in detail the speech synthesis, which translates the
message in the form of voice into another natural language (Sutton, column 17, lines 33-43).
It would have been obvious to a person of ordinary skill in the art at the time the
invention was made, in view of the teaching of Sutton (column 6, lines 42-62), to 
configure Merrill's system as claimed because the generation of facial animation of a 
speech, in which the animation is based on motion control, provides a realistic and
natural visual appearance of human speech in which the voice and the human facial
vision synchronously represent the speech.

As per claim 21, Merrill teaches the claimed "communication method" (Merrill,
column 5, lines 32-46; column 17, lines 9-20), comprising the steps of: preparing 
animation in a first terminal device connected to a network; transmitting a message 
signal of a sentence comprised in a natural language from a second terminal device to a 
host computer via the network (Merrill, column 3, line 30 to column 4, line 10; column 
33, lines 32-40 - computer 20 with access to a network); receiving the sentence of the 
transmitted message signal in the host computer so as to translate the sentence into a 
sentence comprising another language, generating a voice signal corresponding to the 
translated sentence, generating a motion control signal of animation corresponding to 
the voice signal of the translated sentence (Merrill, column 11, lines 44-61; column 23, 
lines 1-30; column 35, lines 28-60 - generation of the animation data of a character 
including the phonemes of the synthesized audio); and transmitting the generated voice 
signal and the generated motion control signal from the host computer to the first 
terminal device via the network; and receiving the transmitted voice signal and the 
transmitted motion control signal in the first terminal device so as to output a voice 
corresponding to the voice signal" (Merrill, column 33, line 41 to column 34, line 18). It 
is noted that Merrill does not teach the facial animation data is "based on the motion 
control signal". Sutton teaches that in speech synchronization, the facial animation data 
is "based on the motion control signal" (Sutton, column 8, lines 6-49). It is also noted 
that Merrill does not teach the message signal is a voice signal. Sutton also teaches in 
detail the speech synthesis, which translates the message in the form of voice into another 
natural language (Sutton, column 17, lines 33-43). 

Claim 22 adds into claim 21 "wherein the animation indicates a face of a human" 
which Merrill teaches in column 4, lines 26-28. 

Claim 23 adds into claim 22 "wherein the motion control signal is a signal for 
controlling a motion of a mouth of the animation corresponding to the translated 
sentence" which Merrill does not explicitly teach. However, Sutton teaches that in 
speech synchronization, the facial animation data including the mouth motion is "based 
on the motion control data" (Sutton, column 3, lines 12-18; column 8, lines 6-49). It 
would have been obvious to a person of ordinary skill in the art at the time the invention 
was made, in view of the teaching of Sutton (column 6, lines 42-62), to configure 
Merrill's system as claimed because the generation of facial animation of a speech, in 
which the animation is based on motion control, provides a realistic and natural visual 
appearance of human speech in which the voice and the human facial vision 
synchronously represent the speech. 

Claim 24 adds into claim 21 "the animation moves in accordance with the output 
of the voice" which Merrill does not explicitly teach. However, Sutton teaches that in 
speech synchronization, the facial animation data is "based on the motion control data 
related to the output voice" (Sutton, column 3, lines 12-18; column 8, lines 6-49). It 
would have been obvious to a person of ordinary skill in the art at the time the invention 
was made, in view of the teaching of Sutton (column 6, lines 42-62), to configure 
Merrill's system as claimed because the generation of facial animation of a speech, in 
which the animation is based on motion control related to output voice, provides a 
realistic and natural visual appearance of human speech in which the voice and the 
human facial vision synchronously represent the speech. 

As per claim 25, Merrill teaches the claimed "communication method" (Merrill, 
column 5, lines 32-46; column 17, lines 9-20), comprising the steps of: receiving a 
message signal of a sentence comprised in a natural language from a terminal device 
(Merrill, column 33, lines 53-66 - a remote computer provides the animation data), 
translating the sentence of the received message signal into a sentence comprising 
another natural language; (Merrill, column 11, lines 44-61; column 23, lines 1-30; 
column 35, lines 28-60 - generation of the animation data of a character including the 
phonemes of the synthesized audio); generating a voice signal corresponding to the 
translated sentence; generating a motion control signal of animation corresponding to 
the generated voice signal (Merrill, column 23, lines 14-30); transmitting the generated 
voice signal and the generated motion control signal to another terminal device" (Merrill, 
column 33, line 41 to column 34, line 18). It is noted that Merrill does not teach the 
message is a voice. Sutton also teaches in detail the speech synthesis, which translates 
the message in the form of voice into another natural language (Sutton, column 17, lines 
33-43). It would have been obvious to a person of ordinary skill in the art at the time the 
invention was made, in view of the teaching of Sutton (column 6, lines 42-62), to 
configure Merrill's system as claimed because the generation of facial animation of a 
speech, in which the input speech signal is a voice signal, provides a realistic and 
natural visual appearance of human speech in which the voice and the human facial 
vision synchronously represent the speech. 



As per claim 26, Merrill teaches the claimed "communication method" (Merrill, 
column 5, lines 32-46; column 17, lines 9-20), comprising the steps of: receiving a 
message signal of a sentence comprised in a natural language from a terminal device 
(Merrill, column 33, lines 53-66 - a remote computer provides the animation data), 
translating the sentence of the received message signal into a sentence comprising 
another natural language; (Merrill, column 11, lines 44-61; column 23, lines 1-30; 
column 35, lines 28-60 - generation of the animation data of a character including the 
phonemes of the synthesized audio); generating a voice signal corresponding to the 
translated sentence (Merrill, column 23, lines 14-30); transmitting the generated voice 
signal to another terminal device" (Merrill, column 33, line 41 to column 34, line 18). It 
is noted that Merrill does not teach the message is a voice. Sutton also teaches in 
detail the speech synthesis, which translates the message in the form of voice into another 
natural language (Sutton, column 17, lines 33-43). It would have been obvious to a 
person of ordinary skill in the art at the time the invention was made, in view of the 
teaching of Sutton (column 6, lines 42-62), to configure Merrill's system as claimed 
because the generation of facial animation of a speech, in which the input speech signal 
is a voice signal, provides a realistic and natural visual appearance of human speech in 
which the voice and the human facial vision synchronously represent the speech. 



The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention. 

Regarding claims 1, 2, 5, 6, 8, 9, 11, 13, 14, 15, .... the phrase "or the like" 
renders the claim(s) indefinite because the claim(s) include(s) elements not actually 
disclosed (those encompassed by "or the like"), thereby rendering the scope of the 
claim(s) unascertainable. See MPEP § 2173.05(d). Similarly, in claims 3, 7, 10, the 
word "likely" renders the claims indefinite. 



Application/Control Number: 09/878,207 Page 24 

Art Unit: 2671 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Phu K. Nguyen, whose telephone number is (703) 305-9796. 
The examiner can normally be reached on M-F 8:00-4:30. 

The fax phone number for the organization where this application or proceeding 
is assigned is 703-872-9306. Information regarding the status of an application may be 
obtained from the Patent Application Information Retrieval (PAIR) system. Status 
information for published applications may be obtained from either Private PAIR or 
Public PAIR. Status information for unpublished applications is available through 
Private PAIR only. For more information about the PAIR system, see 
http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, 
contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 